XGBoost eXplainable AI¶
[1]:
import shap
import pandas as pd
from src.models import retrieve_fit_model as rfm
Retrieving the latest, most accurate fitted XGBoost model¶
[2]:
fit_xgb_model = rfm.get_fit_mlflow_model('xgb')
Calculating Shapley values for the fitted XGBoost model¶
[3]:
def get_fit_model_shapley_values_and_explainer(fit_xgb_model):
    """Return a tuple with the array of Shapley values computed from the fit XGBoost model
    and the TreeExplainer used to compute them.

    Keyword arguments:
    fit_xgb_model -- Fitted XGBoost model
    """
    explainer = shap.TreeExplainer(fit_xgb_model)
    # Shapley values are computed on the processed test dataset.
    data_for_prediction = pd.read_csv('../../data/processed/processed_application_test.csv')
    shap_values = explainer.shap_values(data_for_prediction)
    return shap_values, explainer
[4]:
shap_values, explainer = get_fit_model_shapley_values_and_explainer(fit_xgb_model)
ntree_limit is deprecated, use `iteration_range` or model slicing instead.
Showing explainer base value¶
[5]:
explainer.expected_value
[5]:
-2.649124
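The base value is the explainer's average model output. For an XGBoost classifier it is usually expressed in log-odds rather than as a probability. The sketch below assumes the model was trained with a binary logistic objective (the notebook does not confirm this) and converts the base value shown above into a probability. The helper name `log_odds_to_probability` is hypothetical.

```python
import math


def log_odds_to_probability(log_odds: float) -> float:
    """Map a log-odds (margin) value to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Base value reported by explainer.expected_value above.
# Assumption: this value is in log-odds space (binary logistic objective).
base_value = -2.649124
base_probability = log_odds_to_probability(base_value)
print(f"{base_probability:.3f}")  # prints 0.066
```

Under that assumption, the average predicted probability of the positive class would be roughly 6.6%.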
[6]:
shap_values.shape
[6]:
(10468, 236)
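The shape reads as one SHAP value per feature (236) for each line of the test dataset (10468). SHAP values are additive: for any line, the base value plus the sum of that line's SHAP values gives the model's raw output for that line. The sketch below shows the arithmetic on made-up numbers. The helper `margin_from_shap` and the toy row are hypothetical and are not taken from the notebook's data.

```python
def margin_from_shap(base_value, row_shap_values):
    """Reconstruct the raw model output for one line:
    base value plus the sum of that line's per-feature SHAP values."""
    return base_value + sum(row_shap_values)


base_value = -2.649124          # explainer.expected_value from the notebook
toy_row = [0.40, -0.15, 0.05]   # hypothetical SHAP values, illustration only

margin = margin_from_shap(base_value, toy_row)
print(round(margin, 6))
```

On the real data, the same sum over `shap_values[i]` can be compared with the model's raw output for line `i` as a sanity check.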
Choosing line index to get explanations from¶
[7]:
line_index = 10
Extracting test dataset line from index¶
[8]:
test = pd.read_csv('../../data/processed/processed_application_test.csv')
line = test.iloc[line_index]
Visualizing explanations for a single line in test dataset¶
[9]:
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[line_index], line)
[9]:
Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
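When the interactive force plot cannot be rendered, the same information can be read as text: the features with the largest absolute SHAP values are the ones pushing the prediction furthest from the base value. This is a minimal sketch on made-up values. The helper `top_contributions` and the feature names are hypothetical.

```python
def top_contributions(feature_names, shap_row, k=3):
    """Return the k (feature, SHAP value) pairs with the largest absolute SHAP value."""
    pairs = list(zip(feature_names, shap_row))
    pairs.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return pairs[:k]


# Hypothetical feature names and SHAP values, for illustration only.
feature_names = ["feature_a", "feature_b", "feature_c", "feature_d"]
shap_row = [0.02, -0.31, 0.12, -0.05]

ranked = top_contributions(feature_names, shap_row, k=2)
for name, value in ranked:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.2f} ({direction} the prediction)")
```

In the notebook, calling it as `top_contributions(test.columns, shap_values[line_index])` should give the equivalent ranking for the chosen line.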
Visualizing explanations for all lines in test dataset at once (subsample of 1000 lines)¶
[10]:
shap.force_plot(explainer.expected_value, shap_values[:1000], test.iloc[:1000])
[10]:
Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
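For a stacked force plot, each row of SHAP values has to be paired with the feature values of the same line. If a random subsample is wanted, one option is to draw the row positions once and use them to index both objects. The sketch below illustrates this with plain lists. The helper `aligned_subsample` and the toy rows are hypothetical.

```python
import random


def aligned_subsample(n_rows, sample_size, seed=0):
    """Draw sorted row positions once, so the same positions can index
    both the SHAP values and the feature rows."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), sample_size))


# Hypothetical rows, tagged with their line number for illustration.
shap_rows = [(0, 0.1), (1, -0.2), (2, 0.3), (3, -0.4), (4, 0.5)]
feature_rows = [(0, 10), (1, 20), (2, 30), (3, 40), (4, 50)]

positions = aligned_subsample(len(shap_rows), 3)
sampled_shap = [shap_rows[i] for i in positions]
sampled_features = [feature_rows[i] for i in positions]
```

With the notebook's objects, the same positions would be used as `shap_values[positions]` and `test.iloc[positions]`.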